refactor(memory): Improve Chat Memory logic and efficiency #4065
+127
−26
Title:
Refactor(memory): Improve logic and efficiency of Chat Memory
Body:
Closes: #3945
This pull request addresses the performance overhead in MessageWindowChatMemory by refactoring its internal logic and abstracting the storage layer. The previous implementation replaced the entire message list on every update, which is inefficient for long-running conversations.
Key Changes
Introduced ChatMemoryRepository: A new interface abstracts the storage mechanism, decoupling the chat memory logic from the in-memory implementation and allowing for future persistent implementations (e.g., Redis, JDBC).
Efficient InMemoryChatMemoryRepository: The default InMemoryChatMemoryRepository now uses a refresh method that applies only deltas (additions/deletions) instead of replacing the entire list, which reduces the work done on each update.
Updated MessageWindowChatMemory: The class is refactored to use the new ChatMemoryRepository, delegating all state mutations to the repository layer.
Updated Tests: Comprehensive unit tests have been added and updated to ensure the reliability and correctness of the new implementation.
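The delta-based update described under Key Changes can be illustrated with a minimal, self-contained sketch. It is not the code from this pull request: the names DeltaRefreshSketch, Repository, InMemoryRepository, WindowMemory and the refresh(conversationId, toDelete, toAdd) signature are assumptions made for illustration, and messages are modelled as plain strings rather than the project's message types. Special handling of system messages is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeltaRefreshSketch {

    /** Hypothetical storage abstraction; the PR's real interface may differ. */
    interface Repository {
        List<String> findByConversationId(String conversationId);

        /** Applies only the changes: remove toDelete, then append toAdd. */
        void refresh(String conversationId, List<String> toDelete, List<String> toAdd);
    }

    /** Hypothetical in-memory implementation, guarded by coarse synchronization. */
    static class InMemoryRepository implements Repository {
        private final Map<String, List<String>> store = new HashMap<>();

        @Override
        public synchronized List<String> findByConversationId(String conversationId) {
            // Return an immutable snapshot so callers cannot mutate stored state.
            return List.copyOf(store.getOrDefault(conversationId, List.of()));
        }

        @Override
        public synchronized void refresh(String conversationId, List<String> toDelete, List<String> toAdd) {
            List<String> messages = store.computeIfAbsent(conversationId, id -> new ArrayList<>());
            for (String message : toDelete) {
                messages.remove(message); // removes the first matching entry
            }
            messages.addAll(toAdd);
        }
    }

    /** Hypothetical sliding-window memory that delegates all mutations to the repository. */
    static class WindowMemory {
        private final Repository repository;
        private final int maxMessages;

        WindowMemory(Repository repository, int maxMessages) {
            this.repository = repository;
            this.maxMessages = maxMessages;
        }

        void add(String conversationId, List<String> newMessages) {
            List<String> stored = repository.findByConversationId(conversationId);
            int overflow = Math.max(0, stored.size() + newMessages.size() - maxMessages);

            // Evict the oldest stored messages first.
            int evictStored = Math.min(overflow, stored.size());
            List<String> toDelete = stored.subList(0, evictStored);

            // If the new batch alone exceeds the window, keep only its newest entries.
            int skipNew = overflow - evictStored;
            List<String> toAdd = newMessages.subList(skipNew, newMessages.size());

            repository.refresh(conversationId, toDelete, toAdd);
        }
    }

    public static void main(String[] args) {
        InMemoryRepository repository = new InMemoryRepository();
        WindowMemory memory = new WindowMemory(repository, 3);

        memory.add("c1", List.of("u1", "a1"));
        memory.add("c1", List.of("u2", "a2")); // evicts only "u1"

        System.out.println(repository.findByConversationId("c1")); // prints [a1, u2, a2]
    }
}
```

In this sketch the window logic computes which messages to evict and which to append, and the repository touches only those entries. A persistent implementation (e.g., Redis or JDBC) could map the same two lists to targeted delete and insert operations instead of rewriting the whole conversation.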